MAIME: A Maintenance Manager for ETL Processes

Authors

  • Darius Butkevicius
  • Philipp D. Freiberger
  • Frederik M. Halberg
Abstract

The proliferation of business intelligence applications has moved most organizations into an era where data is an essential success factor. Business focus has accordingly shifted toward the integration and processing of data in the enterprise environment, and developing and maintaining Extract-Transform-Load (ETL) processes has become critical in most data-driven organizations. External Data Sources (EDSs) often change their schemas, which potentially invalidates the ETL processes that extract data from them. Repairing these ETL processes is time-consuming and tedious. As a remedy, we propose MAIME, a tool to (semi-)automatically maintain ETL processes. MAIME works with SQL Server Integration Services (SSIS) and uses a graph model as a layer of abstraction on top of SSIS Data Flow tasks (ETL processes). We introduce a graph alteration algorithm that propagates detected EDS schema changes through the graph. Modifications made to a graph are applied directly to the underlying ETL process. How MAIME handles EDS schema changes can be configured for different SSIS transformations. For the considered set of transformations, MAIME can maintain SSIS Data Flow tasks (semi-)automatically. In an evaluation, compared to doing this manually, the number of user inputs decreased by a factor of 9.5 and the time spent was reduced by a factor of 9.8.
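The idea of propagating a schema change through a graph of transformations can be illustrated with a minimal Python sketch. It is not MAIME's algorithm, which the abstract does not spell out: the graph layout, the `POLICY` table, the node kinds, and the function `propagate_deletion` are all hypothetical names chosen for illustration. The sketch only shows one plausible reading of "configurable propagation": a deleted source column is removed from downstream nodes until a node whose policy is `block` is reached, at which point a user would have to decide.

```python
from collections import deque

# Hypothetical per-transformation policies (an assumption, not MAIME's real settings).
POLICY = {
    "source": "propagate",
    "derived_column": "propagate",
    "aggregate": "block",
    "destination": "propagate",
}


def propagate_deletion(graph, start, column, policy=POLICY):
    """Remove `column` from `start` and from reachable downstream nodes.

    graph maps a node name to {"kind": str, "columns": set, "next": [names]}.
    Returns (changed, blocked): nodes that were repaired automatically and
    nodes where the policy stopped propagation.
    """
    changed, blocked = [], []
    queue = deque([start])
    seen = {start}
    while queue:
        name = queue.popleft()
        node = graph[name]
        if column not in node["columns"]:
            continue  # node does not use the column, so nothing to repair
        if policy.get(node["kind"], "block") == "block":
            blocked.append(name)  # leave untouched; needs a user decision
            continue
        node["columns"].discard(column)
        changed.append(name)
        for successor in node["next"]:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return changed, blocked


# A toy data flow: source -> derived column -> (aggregate, destination).
flow = {
    "src": {"kind": "source", "columns": {"id", "name", "age"}, "next": ["derive"]},
    "derive": {
        "kind": "derived_column",
        "columns": {"id", "name", "age", "label"},
        "next": ["agg", "dest"],
    },
    "agg": {"kind": "aggregate", "columns": {"age"}, "next": []},
    "dest": {
        "kind": "destination",
        "columns": {"id", "name", "age", "label"},
        "next": [],
    },
}

changed, blocked = propagate_deletion(flow, "src", "age")
print("repaired:", changed)
print("needs user input:", blocked)
```

In this toy flow the deletion of `age` is applied to the source, the derived-column step, and the destination, while the aggregate is reported back rather than modified, which mirrors the "(semi-)automatic" flavour described above.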

Similar resources

A BPMN-Based Design and Maintenance Framework for ETL Processes

Business Intelligence (BI) applications require the design, implementation, and maintenance of processes that extract, transform, and load suitable data for analysis. The development of these processes (known as ETL) is an inherently complex problem that is typically costly and time consuming. In a previous work, we have proposed a vendor-independent language for reducing the design complexity ...

Towards a Benchmark for ETL Workflows

Extraction–Transform–Load (ETL) processes comprise complex data workflows, which are responsible for the maintenance of a Data Warehouse. Their practical importance is denoted by the fact that a plethora of ETL tools currently constitutes a multi-million dollars market. However, each one of them follows a different design and modeling technique and internal language. So far, the research commun...

IT4BI Master Thesis Representing ETL Flows with BPMN 2.0

Extract, Transform and Load (ETL) processes are widely used in Data Warehousing in order to extract, cleanse and load data into a centralized location for better analysis and decision-making. As users become more demanding for on-line decision making, ETL processes grow large and more complex. Most processes are deployed at the physical level without any abstraction, thus costs of maintenance a...

Benchmarking ETL Workflows

Extraction–Transform–Load (ETL) processes comprise complex data workflows, which are responsible for the maintenance of a Data Warehouse. A plethora of ETL tools is currently available constituting a multi-million dollar market. Each ETL tool uses its own technique for the design and implementation of an ETL workflow, making the task of assessing ETL tools extremely difficult. In this paper, we...

Improving the ETL process and maintenance of Higher Education Information System Data Warehouse

HEIS (Higher Education Information System) is a project funded by the Croatian Ministry of Science, Education and Sports started in the year 2001. HEIS is a comprehensive information system that provides support for education related processes taking place within a higher education institution. As a part of the project, a data warehouse was developed to provide reporting and analytical features...

Publication date: 2017